Constrained Submodular Minimization for Missing Labels and Class Imbalance in Multi-label Learning
نویسندگان
چکیده
In multi-label learning, there are two main challenges: missing labels and class imbalance (CIB). The former assumes that only a partial set of labels are provided for each training instance while other labels are missing. CIB is observed from two perspectives: first, the number of negative labels of each instance is much larger than its positive labels; second, the rate of positive instances (i.e. the number of positive instances divided by the total number of instances) of different classes are significantly different. Both missing labels and CIB lead to significant performance degradation. In this work, we propose a new method to handle these two challenges simultaneously. We formulate the problem as a constrained submodular minimization that is composed of a submodular objective function that encourages label consistency and smoothness, as well as, class cardinality bound constraints to handle class imbalance. We further present a convex approximation based on the Lovasz extension of submodular functions, leading to a linear program, which can be efficiently solved by the alternative direction method of multipliers (ADMM). Experimental results on several benchmark datasets demonstrate the improved performance of our method over several state-of-the-art methods.
منابع مشابه
Constrained Submodular Minimization Towards Missing Labels and Class Imbalance in Multi-label Learning
Copyright c © 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. edges among instances EX for each class (building connections between the entries within the same row, for every row of Z); last, we copy the edges among classes EC for each instance (building connections between the entries within the same column, for every column of Z). di 6= ∑m...
متن کاملExploiting Associations between Class Labels in Multi-label Classification
Multi-label classification has many applications in the text categorization, biology and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...
متن کاملTowards Class-Imbalance Aware Multi-Label Learning
In multi-label learning, each object is represented by a single instance while associated with a set of class labels. Due to the huge (exponential) number of possible label sets for prediction, existing approaches mainly focus on how to exploit label correlations to facilitate the learning process. Nevertheless, an intrinsic characteristic of learning from multi-label data, i.e. the widely-exis...
متن کاملLarge-scale Multi-label Learning with Missing Labels
The multi-label classification problem has generated significant interest in recent years. However, existing approaches do not adequately address two key challenges: (a) scaling up to problems with a large number (say millions) of labels, and (b) handling data with missing labels. In this paper, we directly address both these problems by studying the multi-label problem in a generic empirical r...
متن کاملSubmodular Multi-Label Learning
In this paper we present an algorithm to learn a multi-label classifier which attempts at directly optimising the F -score. The key novelty of our formulation is that we explicitly allow for assortative (submodular) pairwise label interactions, i.e., we can leverage the co-ocurrence of pairs of labels in order to improve the quality of prediction. Prediction in this model consists of minimising...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016